(57) Abstract: A video watermarking scheme is disclosed, which is designed for the digital cinema format, as it will be used on 
large projector screens in theaters. The watermark is designed in such a way that it has minimal impact on the video quality, but is 
still detectable after capture with a handheld camera and conversion to, for instance, VHS, CD-Video or DVD format. The proposed 
watermarking system only exploits the temporal axis. This makes it invulnerable to geometrical distortions generally caused by 
such a way of capturing. The watermark is embedded by modulating a global property of the frames (e.g. the mean luminance) 
in accordance with the samples of the watermark. The embedding depth is preferably locally adapted within each frame to local 
statistics of the respective image. Watermark detection is performed by correlating the watermark sequence with extracted mean 
luminance values of a sequence of frames. 
EMBEDDING AND DETECTION OF WATERMARK IN A MOTION IMAGE SIGNAL 



FIELD OF THE INVENTION 

The invention relates to a method and apparatus for embedding a watermark in 
a motion image signal. The invention also relates to a method and apparatus for detecting 
said watermark. 

BACKGROUND OF THE INVENTION 

A prior-art method of embedding a watermark in a motion image signal is 
disclosed in International Patent Application WO-A-99/45705. In this prior-art method, a 
two-dimensional sequence of watermark samples is completely, and even a plurality of times, 
embedded in an image of a video signal. 

Watermark embedding is an important aspect of copy protection strategies. 
Although most copy protection schemes deal with protection of electronically distributed 
contents (broadcasts, storage media), copy protection is also desired for movies being shown 
in theaters. Illegal copying in the cinema by means of a handheld video camera is already 
common practice. The quality is usually very low, but the economic impact of illegal VHS 
tapes, CD-Videos and DVDs can be enormous. 

In the coming years, the digital cinema format, 1920x1080x24x36 (pixels/line 
x lines/frame x frames/s x bits/pixel), will be introduced in the theaters. By introducing this 
very high-quality digital format, the threat of illegal copying by handheld video cameras will 
be even larger. For this reason, cinema owners are obliged to prevent the presence of video 
cameras on their premises. Not abiding by this rule may be sanctioned with a ban on the 
future availability of content. In view thereof, it is envisioned to add a watermark during 
show time. The watermark is to identify the cinema, the presentation time, operator, etc. 

Most watermark schemes, including the one mentioned in the opening 
paragraph, are sensitive to alignment errors at detection time. Solutions have been published 
to either insert the watermark in a domain that is invariant for a certain class of geometrical 
transforms, or to find back the alignment during detection. A disadvantage of these methods 
is that they can generally only cope with a limited number of geometrical transformations. 
Furthermore, these methods usually decrease the robustness to other attacks. 
The requirements for the digital cinema watermark, as for any other 
watermarking scheme, are: (i) robustness, (ii) imperceptibility and (iii) a low false positive 
rate. Achieving sufficient robustness is the most challenging requirement. The handheld 
camera will not only seriously degrade the video by filtering (the optical path from the 
screen to the camera, transfer to tape, etc.) but also seriously distort the video geometrically 
(shifting, scaling, rotation, shearing, changes in perspective, etc.). In addition, these 
geometrical distortions can change from frame to frame. 

OBJECT AND SUMMARY OF THE INVENTION 

It is an object of the invention to provide a method of embedding a watermark 
in a motion image signal which fulfills the above-mentioned requirements, particularly with 
respect to robustness against geometrical distortions. 

To this end, the method according to the invention comprises the steps of 
determining, for each image, a global property of the pixels constituting said image, and 
modifying the global property of each image of a sequence of images in accordance with the 
corresponding watermark sample. In a preferred embodiment, said global property is the 
mean luminance of an image. 

It is achieved with the invention that the sequence of watermark samples 
constituting the watermark is distributed in a corresponding sequence of images, one 
watermark sample being embedded per image. The method thus embeds the watermark along 
the temporal axis and is therefore inherently robust against all geometrical distortions. 

Since the Human Visual System is sensitive to flicker in low spatial 
frequencies, the watermarked signal may suffer from artifacts especially in non-moving flat 
areas. These artifacts can be significantly reduced by lowering the flicker frequency of the 
watermark, i.e. by embedding the same watermark sample in a fixed number of 
consecutive frames. Furthermore, it is proposed to use an adaptive scheme, where the change 
in luminance for a pixel depends on a local scaling factor, which is determined for every 
pixel. The local scaling factor should be large in moving textured areas and low in 
non-moving flat areas. 

The embedded watermark is detected by determining the global properties at 
the detection end, correlating a sequence of global properties with a sequence of reference 
watermark samples, and generating an output signal if the correlation value exceeds a 
predetermined threshold value. 



WO 03/001813 PCT/IB02/02335 



BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows a schematic diagram of a watermark embedder according to the invention. 

Fig. 2 shows a schematic diagram of a watermark detector according to the invention. 

DESCRIPTION OF EMBODIMENTS 

Fig. 1 shows a schematic diagram of a watermark embedder according to 
the invention. The embedder receives a cinema movie in the form of an HDTV video signal 
having a luminance F(x,n) at spatial position x of frame n. The embedder further receives a 
watermark in the form of a pseudo-random sequence w(n) of length N, where w(n) ∈ [-1, 1]. 
An appropriate value of N for this application is N=1024. 

In the simplest embodiment of the watermark embedder, the sequence w(n) is 
directly applied to an embedding stage 1 which embeds one watermark sample in every 
frame. In the preferred embodiment, this is performed by increasing the luminance of every 
pixel of frame n by 1 if the watermark sample w(n)=+1, and decreasing by 1 if w(n)=-1. The 
mean luminance of the sequence of frames is thus modulated by the watermark. The 
watermark repeats itself every N frames. 
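
By way of illustration only, the simplest embodiment may be sketched in Python as follows. The function name embed_simple, the use of NumPy arrays, the 8-bit luminance range and the clipping to that range are assumptions made for the sketch and are not part of the disclosure.

```python
import numpy as np

def embed_simple(frames, w):
    """Add w(n) = +1 or -1 to every pixel of frame n (hypothetical helper).

    frames: array of shape (num_frames, height, width), 8-bit luminance (assumed).
    w:      watermark sequence of length N; it repeats every N frames.
    """
    frames = np.asarray(frames, dtype=np.int16)   # headroom for +/-1
    out = np.empty_like(frames)
    N = len(w)
    for n in range(len(frames)):
        out[n] = frames[n] + int(w[n % N])        # one watermark sample per frame
    # Clipping to the 8-bit range is an assumption of this sketch.
    return np.clip(out, 0, 255).astype(np.uint8)
```

For flat frames of luminance 100 and w = [+1, -1], the mean luminance of successive output frames alternates between 101 and 99, which is the modulation of the mean luminance described above.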

Other examples of frame parameters that can be modulated by the watermark 
are picture histograms (a list of relative frequencies of luminance values in the picture), or 
features derived therefrom such as higher-order moments (average of luminance values to a 
power k). The average luminance is a specific example of the latter (k=1). 
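
The frame parameters mentioned above may be computed, for example, as in the following sketch. The function names and the assumption of 256 integer luminance levels are illustrative only.

```python
import numpy as np

def luminance_histogram(frame, levels=256):
    """Relative frequencies of the luminance values in a frame (integer pixels assumed)."""
    counts = np.bincount(np.asarray(frame).ravel(), minlength=levels)
    return counts / np.asarray(frame).size

def luminance_moment(frame, k):
    """Average of the luminance values raised to the power k; k=1 is the mean luminance."""
    return float(np.mean(np.asarray(frame, dtype=float) ** k))
```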

Since the Human Visual System (HVS) is sensitive to flicker in low spatial 
frequencies, this simple embodiment may suffer from artifacts especially in non-moving flat 
areas. These artifacts are significantly reduced by lowering the flicker frequency of the 
watermark. This is performed by a repetition stage 2 which repeats each watermark sample 
during a predetermined number T of consecutive images. The same watermark sample is thus 
embedded in a number of consecutive frames. 
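
The effect of the repetition stage on the frame-to-sample mapping may be sketched as follows. The helper name and the zero-based frame index are assumptions of the sketch.

```python
def watermark_sample_for_frame(w, n, T):
    """Return the watermark sample embedded in frame n (zero-based, assumed).

    Each sample of w is held for T consecutive frames, and the whole
    sequence repeats after len(w) * T frames.
    """
    return w[(n // T) % len(w)]
```

With w = [+1, -1, +1] and T = 2, frames 0 to 5 receive the samples +1, +1, -1, -1, +1, +1, after which the sequence starts again.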

The preferred embodiment of the embedder which is shown in Fig. 1 further 
adapts the embedding depth in dependence upon the image contents. To this end, the 
embedder comprises a multiplier 3 which multiplies each watermark sample with a local 
scaling factor v(x,n). The local scaling factor is large in moving textured areas and low in 
non-moving flat areas. To achieve this, the local scaling factor v(x,n) is the minimum of a 
spatial-scaling factor λ(x,n) and a motion-scaling factor μ(x,n). Moreover, the result is 
clipped if it exceeds a maximum allowable luminance change v_max. This operation is 
performed by a selector 11. 

The spatial adaptation is realized by a spatial adaptation stage, which 
comprises a Laplacian filter 4, a multiplier 5, and absolute value calculating means 6. The 
spatial adaptation stage receives the luminance values F(x,n) and generates the local-scaling 
factor λ(x,n) using the absolute value of the response of the Laplacian filter and 
multiplication with a global factor s in accordance with: 

λ(x,n) = s · |L(x) ∗ F(x,n)| 

where L(x) denotes the Laplacian filter kernel and ∗ denotes two-dimensional convolution. 

The global scaling factor is a trade-off between visibility and robustness. 
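
A possible realization of the spatial adaptation stage is sketched below. The 3x3 kernel coefficients and the replication of edge pixels at the frame border are assumptions of the sketch; the text above specifies only that a Laplacian filter, an absolute value and a multiplication with the global factor s are used.

```python
import numpy as np

# Assumed 3x3 Laplacian kernel; the actual coefficients are not given in the text.
LAPLACIAN = np.array([[-1.0, -1.0, -1.0],
                      [-1.0,  8.0, -1.0],
                      [-1.0, -1.0, -1.0]])

def spatial_scaling(frame, s):
    """Spatial-scaling factor: s times the absolute Laplacian response."""
    h, w = frame.shape
    padded = np.pad(np.asarray(frame, dtype=float), 1, mode="edge")  # border handling assumed
    response = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            # The kernel is symmetric, so correlation equals convolution here.
            response += LAPLACIAN[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return s * np.abs(response)
```

The factor is zero in flat areas and grows with local texture, so that the embedding depth is reduced where flicker would be most visible.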

The motion-scaling factor μ(x,n) is generated by a motion detector comprising 
a frame memory 7, a subtracter 8, and absolute value calculating means 9. The detector 
calculates the motion-scaling factor μ for every pixel by determining the absolute difference 
with the previous frame. In order to be able to embed a watermark in a non-moving sequence, 
a small offset μ_min is added by an adder 10 to the absolute frame difference. 
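
The motion detector may be sketched as follows, assuming that the current and previous frames are available as NumPy arrays of equal shape. The function and parameter names are illustrative.

```python
import numpy as np

def motion_scaling(frame, previous_frame, mu_min):
    """Motion-scaling factor: absolute frame difference plus the offset mu_min."""
    diff = np.asarray(frame, dtype=float) - np.asarray(previous_frame, dtype=float)
    return np.abs(diff) + mu_min
```

For a completely static sequence the factor equals mu_min everywhere, so a watermark can still be embedded.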

The watermarked frame F_w(x,n) is obtained by adding the resulting watermark 
W(x,n) to the original frame F(x,n). It is this watermarked signal F_w(x,n) which is projected 
on the cinema screen. 
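
The combination performed by the selector, the multiplier and the final adder may be sketched as follows. The sketch assumes that the spatial-scaling and motion-scaling factors have already been computed per pixel; the function name and the floating-point output are assumptions.

```python
import numpy as np

def embed_adaptive(frame, w_n, lam, mu, v_max):
    """Return the watermarked frame F_w = F + v * w(n).

    lam, mu: per-pixel spatial-scaling and motion-scaling factors.
    v_max:   maximum allowable luminance change.
    """
    v = np.minimum(lam, mu)          # minimum of the two scaling factors
    v = np.minimum(v, v_max)         # clip to the maximum allowable change
    watermark = v * w_n              # local embedding depth times the sample
    return np.asarray(frame, dtype=float) + watermark
```

For example, with w(n) = -1, v_max = 4, a pixel having factors 0.5 and 3 is lowered by 0.5, whereas a pixel having factors 8 and 6 is lowered by the clipped value 4.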

Fig. 2 shows a schematic diagram of a watermark detector according to the 
invention. Although the original signal is available during detection, the detector does not 
utilize any knowledge about the original. The detector receives a recorded version F'_w(x,n) of 
the watermarked signal F_w(x,n). The arrangement comprises luminance extraction means 21, 
which calculates the mean luminance Y(n) of every image n. The extracted luminance values 
Y(n) of N·T images are distributed to T buffers 221, 222, ..., where, as described above, T is 
the number of consecutive images in which the same watermark sample is embedded. Each 
buffer stores N mean luminance values. Typical values of the watermark length N and the 
frames per watermark sample T are 1024 and 5, respectively. Accordingly, the first buffer 
221 contains mean luminance values Y(1), Y(6), Y(11), ..., the second buffer 222 contains 
mean luminance values Y(2), Y(7), Y(12), ..., etc. This implies that the granularity of 
watermark detection is approximately 3 minutes and 25 seconds for PAL video (N·T = 5120 
frames at 25 frames/s). In order to boost the detection, the buffer contents may be filtered 
with an FIR filter [-1 2 -1] and 
subsequently clipped between -10 and +10 to equalize the data in the buffers. Said filtering 
and equalizing step is not shown in the Figure. 
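
The distribution of the mean luminance values over the T buffers, and the optional filtering and clipping, may be sketched as follows. The function names are illustrative, and the zero-padding applied by the filter at the buffer ends is an assumption of the sketch.

```python
import numpy as np

def fill_buffers(mean_luminance, N, T):
    """Distribute N*T values Y(1), Y(2), ... over T interleaved buffers of length N."""
    y = np.asarray(mean_luminance, dtype=float)[:N * T]
    return y.reshape(N, T).T     # row t holds Y(t+1), Y(t+1+T), Y(t+1+2T), ...

def equalize(buffers, limit=10.0):
    """Filter each buffer with the FIR filter [-1 2 -1] and clip to +/-limit."""
    kernel = np.array([-1.0, 2.0, -1.0])
    filtered = np.array([np.convolve(b, kernel, mode="same") for b in buffers])
    return np.clip(filtered, -limit, limit)

def granularity_seconds(N, T, frames_per_second):
    """Duration of the N*T frames needed to fill all buffers."""
    return N * T / frames_per_second
```

With N = 1024, T = 5 and 25 frames/s, granularity_seconds yields 204.8 s, which agrees with the figure of approximately 3 minutes and 25 seconds given above.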

The watermark is detected by determining the similarity of each buffer content 
with one or possibly more reference watermarks w(n). Each watermark can, for instance, 
identify one theater. A well-known example of similarity is cross-correlation, but other 
measures are possible. The contents of each buffer are cross-correlated with the reference 
watermark in respective correlators 231, 232, ... The correlation is preferably performed 
using Symmetrical Phase Only Matched Filtering (SPOMF). For a description of SPOMF, 
reference is made to International Patent Application WO 99/45706. In said document, the 
correlation is performed in the two-dimensional spatial domain. Blocks of NxN image pixels 
are correlated with an NxN reference watermark. The result of the SPOMF operation is an 
NxN pattern of correlation values exhibiting one or more peaks if a watermark has been 
embedded. 

The T correlators 231, 232, ... operate in the one-dimensional time domain. The 
output of each correlator is a series of N correlation values which is stored in a corresponding 
one of T buffers 241, 242, ... A peak detector 25 searches the highest correlation value in the 
T buffers, and applies said peak value to a threshold circuit 26. If the peak value of at least 
one of the buffers is larger than a given threshold value, it is decided that the watermark is 
present. Otherwise the content will be classified as not watermarked. A suitable threshold 
value has been found to be 5 standard deviations, which corresponds to a false alarm 
probability of 1.43 × 10^-6. 
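
A one-dimensional SPOMF correlator with peak detection and thresholding may be sketched as follows. The sketch assumes the usual formulation of SPOMF, in which both spectra are reduced to their phase before multiplication, and it normalizes each correlation buffer by its own standard deviation; neither detail is taken from the present text. The function false_alarm_probability merely illustrates that T = 5 times the one-sided Gaussian tail probability at 5 standard deviations reproduces the figure quoted above; that reading of the figure is an observation, not a statement of the disclosure.

```python
import math
import numpy as np

def phase_only(spectrum, eps=1e-12):
    """Reduce every frequency bin to unit magnitude (bins near zero stay near zero)."""
    return spectrum / np.maximum(np.abs(spectrum), eps)

def spomf(y, w):
    """Circular phase-only correlation of buffer y with reference watermark w."""
    Y = phase_only(np.fft.fft(np.asarray(y, dtype=float)))
    W = phase_only(np.fft.fft(np.asarray(w, dtype=float)))
    return np.fft.ifft(Y * np.conj(W)).real

def detect(buffers, w, threshold_sigmas=5.0):
    """Return (present, peak), peak being the largest normalized correlation value."""
    peak = 0.0
    for y in buffers:
        c = spomf(y, w)
        peak = max(peak, float(np.max(c) / np.std(c)))  # assumes a non-constant buffer
    return peak > threshold_sigmas, peak

def false_alarm_probability(threshold_sigmas, num_buffers):
    """num_buffers times the one-sided Gaussian tail probability (illustrative)."""
    q = 0.5 * math.erfc(threshold_sigmas / math.sqrt(2.0))
    return num_buffers * q
```

The position of the peak within a correlation buffer indicates the cyclic shift of the embedded watermark relative to the reference watermark.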

A payload can be encoded in the signal by embedding shifted versions of the 
watermark w(n) in a manner similar to that disclosed in International Patent Application 
WO-A-99/45705, already cited in the opening paragraph. It should further be noted that, 
although T parallel correlators are shown in Fig. 2, it may be advantageous to carry out the 
respective operations in a time-sequential manner. 
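
The principle of carrying a payload in a shift of the watermark may be illustrated by the following simplified sketch, in which the cyclic shift itself is the payload and the detector is assumed to be synchronized to the start of the sequence. The cited application may use a different arrangement, and a plain circular cross-correlation is used here instead of SPOMF. All names are illustrative.

```python
import numpy as np

def embed_shifted(w, shift):
    """Return the watermark sequence cyclically shifted by 'shift' samples."""
    return np.roll(np.asarray(w, dtype=float), shift)

def recover_shift(y, w):
    """Estimate the cyclic shift of w contained in y from the correlation peak."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()                                   # remove the mean luminance offset
    spectrum = np.fft.fft(y) * np.conj(np.fft.fft(np.asarray(w, dtype=float)))
    correlation = np.fft.ifft(spectrum).real           # circular cross-correlation
    return int(np.argmax(correlation))
```

A sequence of length N = 1024 allows up to 1024 distinct shifts, i.e. up to 10 bits per embedded watermark in this simplified arrangement.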

A video watermarking scheme is disclosed, which is designed for the digital 
cinema format, as it will be used on large projector screens in theaters. The watermark is 
designed in such a way that it has minimal impact on the video quality, but is still detectable 
after capture with a handheld camera and conversion to, for instance, VHS, CD-Video or 
DVD format. The proposed watermarking system only exploits the temporal axis. This makes 
it invulnerable to geometrical distortions generally caused by such a way of capturing. The 
watermark is embedded by modulating a global property of the frames (e.g. the mean 
luminance) in accordance with the samples of the watermark. The embedding depth is 
preferably locally adapted within each frame to local statistics of the respective image. 
Watermark detection is performed by correlating the watermark sequence with extracted 
mean luminance values of a sequence of frames. 
1. A method of embedding a watermark in a motion image signal, the method 
comprising the steps of: 

representing said watermark by a sequence of watermark samples; 

determining, for each image of a corresponding sequence of images, a global property of the 
pixels constituting said image; and 

modifying the global property of each image of said sequence of images in accordance with 
the corresponding watermark sample. 

2. A method as claimed in claim 1, wherein said step of modifying comprises 
modifying series of a predetermined number of consecutive images in accordance with the 
same watermark sample. 

3. A method as claimed in claim 1, wherein said global property is the mean 
luminance of the pixels constituting an image. 

4. A method as claimed in claim 1, wherein said step of modifying includes 
adaptively modifying the pixels of an image in dependence upon spatial activity within said 
image. 

5. A method as claimed in claim 1, wherein said step of modifying includes 
adaptively modifying the pixels of an image in dependence upon motion detected between 
consecutive images. 

6. A method of detecting a watermark in a motion image signal, the method 
comprising the steps of: 

providing a reference watermark as a sequence of reference watermark samples; 
determining, for each image of a corresponding sequence of images, a global property of the 
pixels constituting said image; 

correlating the sequence of reference watermark samples with a corresponding sequence of 
said global properties; and 

generating an output signal if said correlating step yields a correlation value exceeding a 
predetermined threshold value. 

7. A method as claimed in claim 6, further comprising the steps of providing a 
series of interleaved sequences of global properties, wherein the step of correlating comprises 
correlating the sequence of reference watermark samples with each one of said interleaved 
sequences of said global properties to obtain a series of correlation values, and the step of 
generating an output signal comprises generating the output signal if the largest of said 
correlation values exceeds a predetermined threshold value. 

8. A method as claimed in claim 6, wherein said global property is the mean 
luminance of the pixels constituting an image frame. 

9. An apparatus for embedding a watermark in a motion image signal, the 
apparatus comprising: 

means for providing said watermark as a sequence of watermark samples; 

means for determining, for each image of a corresponding sequence of images, a global 
property of the pixels constituting said image; and 

means for modifying the global property of each image of said sequence of images in 
accordance with the corresponding watermark sample. 

10. An apparatus for detecting a watermark in a motion image signal, the 
apparatus comprising: 

means for providing a reference watermark as a sequence of reference watermark samples; 

means for determining, for each image of a corresponding sequence of images, a global 
property of the pixels constituting said image; 

means for correlating the sequence of reference watermark samples with a corresponding 
sequence of said global properties; and 

means for generating an output signal if said correlating step yields a correlation value 
exceeding a predetermined threshold value. 

11. A motion image signal with an embedded watermark represented by a 
sequence of watermark samples, characterized in that, for each image of a sequence of 
images, a global property of the pixels constituting said image has been modified in 
accordance with a respective watermark sample of the corresponding sequence of watermark 
samples. 

12. A storage medium having recorded thereon a motion image signal as claimed 
in claim 11. 